The SED heuristic for morpheme discovery: a look at Swahili

Authors

  • Yu Hu
  • Irina Matveeva
  • John Goldsmith
  • Colin Sprague
Abstract

This paper describes a heuristic for morpheme- and morphology-learning based on string edit distance. Experiments with a 7,000-word corpus of Swahili, a language with a rich morphology, support the effectiveness of this approach.
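For readers unfamiliar with string edit distance, the following is a minimal sketch of the standard unit-cost (Levenshtein) computation. It is not the authors' SED heuristic, whose details the abstract does not give. The function name `edit_distance` and the Swahili example words are illustrative assumptions, not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance between two strings (illustrative helper)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Assumed example pair: two Swahili verb forms that share the stem "soma".
print(edit_distance("anasoma", "alisoma"))
```

Word pairs with a small distance relative to their length, such as the pair above, tend to share long substrings, which is the kind of signal an edit-distance-based approach to morpheme discovery can build on.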


Related resources

Refining The SED Heuristic For Morpheme Discovery: Another Look At Swahili

This paper describes a heuristic for morpheme- and morphology-learning based on string edit distance. Experiments with a 7,000-word corpus of Swahili, a language with a rich morphology, support the effectiveness of this approach.


A heuristic for morpheme discovery based on string edit distance

This paper derives from work we have been doing on unsupervised learning of the morphology of languages with rich morphologies, that is, with a high average number of morphemes per word. Our focus in this paper is Swahili, a major Bantu language of East Africa, and our goal is the development of a system that can automatically produce a morphological analyzer of a text on the basis of a large c...


Analyse des performances de modèles de langage sub-lexicale pour des langues peu-dotées à morphologie riche

Performance analysis of sub-word language modeling for under-resourced languages with rich morphology: case study on Swahili and Amharic. This paper investigates the impact on ASR performance of sub-word units for two under-resourced African languages with rich morphology (Amharic and Swahili). Two sub-word units are considered: syllable and morpheme, the latter being obtained in a supervised or...


Analyse des performances de modèles de langage sub-lexicale pour des langues peu-dotées à morphologie riche (Performance analysis of sub-word language modeling for under-resourced languages with rich morphology: case study on Swahili and Amharic) [in French]

Performance analysis of sub-word language modeling for under-resourced languages with rich morphology: case study on Swahili and Amharic. This paper investigates the impact on ASR performance of sub-word units for two under-resourced African languages with rich morphology (Amharic and Swahili). Two sub-word units are considered: syllable and morpheme, the latter being obtained in a supervised or...


The Discourse Function of Object Marking in Swahili

1. Introduction The Swahili object marker (OM) is a verbal prefix that agrees with an object of the verb. In Swahili there is no semantic or lexical class of objects for which object marking is obligatory, nor is there any class for which it is impossible. The numerous earlier studies of the object marker have discovered no hard and fast rules for its distribution. Its usage has been found to c...




Publication date: 2005